Abstract. In this work, we give an overview of our participation in the
TREC 2014 Contextual Suggestion Track. To retrieve attraction places, we
propose a fuzzy-based document combination approach for preference learning
and context processing. Our submission uses the open web and exploits two
relevance criteria: user preferences and geographical location.


1  Introduction 

The TREC 2014 Contextual Suggestion track examines search techniques that aim
to answer complex information needs that are highly dependent on context and
user interests. Roughly speaking, given a user, the track focuses on travel
suggestions (e.g., attraction places, restaurants, pizzerias, etc.) based on two
dependent relevance criteria: (1) the user's interests, which consist of personal
preferences and past history; (2) the user's geographical location.

To address this challenge, we used the open web to search for relevant places
according to the given contexts of the track. This track was an opportunity
for us to test a previous approach on multidimensional (personalized)
relevance combination [2,3]. The aggregation operator proposed here can offer
insight into why some relevance criteria are weighted more highly than others,
and can personalize the majority preference with respect to both the
specificity of the IR task and the user preferences.

The remainder of the paper is organized as follows. Section 2 describes our
approach for the retrieval of contextual suggestions. Sections 3 and 4
describe the data used and the experimental setup, and discuss the results
obtained. Section 5 concludes the paper.

2  The  Contextual  Retrieval  Framework 

We address the contextual retrieval problem as a multi-criteria decision
making (MCDM) problem. The difficulty is to (i) correctly identify which
criteria need to be enhanced vs. weakened with respect to both the user's
preferences and the context; and (ii) accurately combine these criteria. We
rely on the Choquet operator as an aggregation operator. This mathematical
function is built on a fuzzy measure $\mu$, defined below.

Let $I_C$ be the set of all possible subsets of criteria from $C$.

A fuzzy measure is a normalized monotone function $\mu$ from $I_C$ to $[0, 1]$
such that:

$\forall I_{C_1}, I_{C_2} \in I_C$, if $I_{C_1} \subseteq I_{C_2}$ then
$\mu(I_{C_1}) \leq \mu(I_{C_2})$, with $\mu(\emptyset) = 0$ and $\mu(C) = 1$.

$\mu(I_{C_i})$ will be denoted by $\mu_{C_i}$. The value of $\mu_{C_1}$ can be
interpreted as the importance degree of the interaction between the criteria
involved in the subset $I_{C_1}$. The personalized Choquet integral-based
relevance aggregation function is defined as follows:

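$RSV_C^u(q, d_j)$ is the personalized relevance score of document $d_j$ for
user $u$ w.r.t. the set of relevance criteria $C = \{c_1, c_2\}$, defined as
follows:

\[
\begin{aligned}
RSV_C^u(q, d_j) &= Ch_\mu\big(RSV_{c_1}^u(q, d_j),\, RSV_{c_2}^u(q, d_j)\big) \\
                &= \sum_{i=1}^{N} \mu^u_{\{c_i, \ldots, c_N\}} \cdot \big(rsv^u_{(i)j} - rsv^u_{(i-1)j}\big)
\end{aligned}
\tag{1}
\]

where $Ch_\mu$ is the Choquet function, $N$ is the number of criteria (here
$N = 2$), $rsv^u_{(i)j}$ is the $i$th element of the permutation of
$RSV(q, d_j)$ on criterion $c_i$, such that
$0 \leq rsv^u_{(1)j} \leq \ldots \leq rsv^u_{(N)j}$ (with $rsv^u_{(0)j} = 0$),
and $\mu^u_{\{c_i, \ldots, c_N\}}$ is the importance degree of the set of
criteria $\{c_i, \ldots, c_N\}$ for user $u$.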
In this way, we are able to automatically adjust the ranking model's parameters
for each user and make results dependent on the user's preferences over the
considered criteria. Note that if $\mu$ is an additive measure, the Choquet
integral corresponds to the weighted mean.
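
The aggregation in Equation (1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the submitted run: the function name `choquet`, the criterion names `interest` and `location`, and the measure values in the example are all hypothetical.

```python
def choquet(scores, mu):
    """Discrete Choquet integral of per-criterion scores.

    scores: dict mapping criterion name -> relevance score in [0, 1]
    mu:     dict mapping frozenset of criterion names -> importance degree
            (a fuzzy measure; the full set of criteria must map to 1.0)
    """
    # Order criteria by ascending score: rsv_(1) <= ... <= rsv_(N).
    ordered = sorted(scores, key=scores.get)
    total = 0.0
    previous = 0.0  # rsv_(0) = 0 by convention
    for i, criterion in enumerate(ordered):
        # Criteria whose score is at least the current one.
        coalition = frozenset(ordered[i:])
        total += mu[coalition] * (scores[criterion] - previous)
        previous = scores[criterion]
    return total


# Hypothetical non-additive measure over two criteria (0.6 + 0.2 != 1).
example_mu = {
    frozenset({"interest"}): 0.6,
    frozenset({"location"}): 0.2,
    frozenset({"interest", "location"}): 1.0,
}
example_score = choquet({"interest": 0.5, "location": 1.0}, example_mu)
```

With an additive measure (singleton degrees summing to 1), the same function returns the weighted mean of the scores, which is the special case noted above.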

For a given user, learning the criteria importance typically requires a set of
training queries and, for each query, a list of ranked documents represented
by pre-computed vectors of performance scores, where each document is
annotated with a rank label (e.g., relevant or irrelevant). The general
methodology is detailed in Figure 1. As most of the required information is
not present in the TREC collection, we adopt a simple strategy that assigns
all users the same preferences. Based on an analysis of the TREC 2013
Contextual Suggestion track data [3], we found that for most contexts the
user preferences criterion is more important than location. Therefore, we
assign importance degrees of 0.7 and 0.3 to the user preferences and location
criteria, respectively.
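
As a worked illustration, assume that the two degrees are the singleton measures $\mu_{\{c_1\}} = 0.7$ (user preferences) and $\mu_{\{c_2\}} = 0.3$ (location), with $\mu_{\{c_1, c_2\}} = 1$ by normalization. The scores $RSV_{c_1} = 0.6$ and $RSV_{c_2} = 0.2$ are hypothetical and not taken from the track data. Sorting them gives $rsv_{(1)} = 0.2$ (location) and $rsv_{(2)} = 0.6$ (user preferences), so that:

```latex
\begin{align*}
RSV_C^u(q, d_j) &= \mu_{\{c_1, c_2\}} \cdot (0.2 - 0) + \mu_{\{c_1\}} \cdot (0.6 - 0.2) \\
                &= 1 \cdot 0.2 + 0.7 \cdot 0.4 = 0.48, \\
0.7 \cdot 0.6 + 0.3 \cdot 0.2 &= 0.42 + 0.06 = 0.48.
\end{align*}
```

Because $0.7 + 0.3 = 1$, the measure is additive under this reading, and the Choquet score coincides with the weighted mean of the two criterion scores.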


3  Data  Preparation  and  Indexing 

Fig. 1: General paradigm for training the criteria importance (user interest
and location).

To fetch the candidate suggestion places, we crawl the open web through the
Google Places API (https://developers.google.com/places). Like most of the
TREC Contextual Suggestion track participants, we start by querying the
Google Places API with the appropriate queries corresponding to every context,
based on the location. Because this API returns at most 60 suggestions, we
search again with different parameters, such as place types that are relevant
to the track (e.g., restaurant, cafe, museum, etc.). Beyond the type of the
place, we also pass the latitude and longitude as parameters to Google Places
to obtain results w.r.t. the searched context.
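
The repeated querying by place type can be sketched as follows. This is only an illustration: the endpoint URL, the parameter names (`location`, `radius`, `type`, `key`) and the default radius are assumptions about the Places Nearby Search web service, and the paper does not state which endpoint or values were actually used. The code only builds request URLs and performs no network access.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Places "Nearby Search" web service.
NEARBY_SEARCH_URL = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"


def build_nearby_search_url(lat, lng, place_type, api_key, radius_m=5000):
    """Build one request URL for a context location and a place type."""
    params = {
        "location": f"{lat},{lng}",  # context latitude and longitude
        "radius": radius_m,          # hypothetical search radius in meters
        "type": place_type,          # e.g. "restaurant", "cafe", "museum"
        "key": api_key,
    }
    return f"{NEARBY_SEARCH_URL}?{urlencode(params)}"


def urls_for_context(lat, lng, place_types, api_key):
    """One URL per place type, to collect more candidates than a single query returns."""
    return [build_nearby_search_url(lat, lng, t, api_key) for t in place_types]
```

For example, `urls_for_context(42.36, -71.06, ["restaurant", "cafe", "museum"], "YOUR_KEY")` yields three URLs for the same context, one per place type.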

To obtain the document scores w.r.t. the geolocalisation criterion, we compute
the distance between the retrieved places and the context. For the user
interest score, we use the cosine similarity between the description of each
candidate suggestion and the user profile. A user profile is represented by a
vector of terms constructed from the user's personal preferences on the
example suggestions. The description of a place is the result snippet returned
by the Google search engine when the URL of the place is issued as a query.
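
Both criterion scores can be sketched in Python as below. The paper does not specify the distance formula, how a distance is turned into a score, or the term weighting. The haversine distance, the `1 / (1 + d)` mapping and the raw term-frequency vectors are therefore assumptions made for illustration only.

```python
import math
from collections import Counter


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def location_score(distance_km):
    """Hypothetical mapping from distance to a score in (0, 1]; closer is better."""
    return 1.0 / (1.0 + distance_km)


def cosine_similarity(text_a, text_b):
    """Cosine similarity between raw term-frequency vectors of two texts."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0
```

Here `text_a` would be the snippet describing a candidate place and `text_b` the concatenated descriptions of the examples the user rated positively.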

4  Runs  and  Evaluation  Results 

The TREC 2014 Contextual Suggestion data set has the following
characteristics:

—  Users:  The  total  number  of  users  is  115.  Each  user  is  represented  by  a  profile 
reflecting his or her preferences for places in a list of 100 example
suggestions. An example suggestion is an attraction place expected to be
interesting for the user. Preferences are given on a 5-point scale for each
place description, which includes a title, a brief narrative description and
a website URL. Positive preferences are those with a rating of 3 or 4 w.r.t.
the above features. Ratings of 0 and 1 on example suggestions are viewed as
non-relevant, and ratings of 2 are considered neutral.

—  Contexts  and  queries:  A  list  of  100  contexts  is  provided,  where  each 
context  corresponds  to  a  particular  city  location,  described  with  longitude 
and latitude parameters. A (user, context) pair represents a query; the aim
of the task is to provide a ranked list of 50 suggestions that satisfy both
relevance criteria as much as possible.

—  Relevance  assessments:  Relevance  assessments  of  this  task  are  made  by 
both users and NIST assessors. The user corresponding to each profile judged
suggestions in the same way as examples, assigning a rating of 0 to 4 for
each title/description and URL, whereas NIST assessors judged suggestions in
terms of geographical appropriateness on a 3-point scale (2, 1 and 0). A
suggestion is relevant if it has a rating of 3 or 4 w.r.t. user interests
(profile) and a rating of 1 or 2 for the geolocalisation criterion.
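
The relevance rule in the last item can be written as a small predicate. This is a sketch: the function and parameter names are hypothetical, and requiring both the description rating and the URL rating to be positive is an assumed reading of "user interests (profile)".

```python
def is_relevant(description_rating, url_rating, geo_rating):
    """Return True if a suggestion counts as relevant under the rule above.

    description_rating, url_rating: user ratings on the 0-4 scale
    geo_rating: NIST geographical appropriateness on the 0-2 scale
    """
    interest_ok = description_rating >= 3 and url_rating >= 3  # rating of 3 or 4
    geo_ok = geo_rating >= 1                                   # rating of 1 or 2
    return interest_ok and geo_ok
```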

In the present work, we followed the guidelines described on the TREC website
[1] and submitted only one run. Table 1 shows the retrieval performance
obtained using our operator. The performance of our approach is about 0.22
w.r.t. P@5 and about 0.33 w.r.t. MRR. P@5 measures the fraction of relevant
suggestions within the top-5 results, where a suggestion is relevant if the
user judged its description positively and it satisfies the geographical
location criterion. As described in the track guidelines, the MRR score is
computed as 1/k, where k is the rank of the first relevant attraction found,
while TBG (time-biased gain) takes the time spent by the user into account
when measuring effectiveness.
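
For a single ranked list, the two rank-based measures can be sketched as follows. This is a minimal illustration, not the official track evaluation script. The function names are hypothetical, and the input is assumed to be a list of per-rank relevance flags for one (user, context) pair.

```python
def precision_at_k(relevance, k=5):
    """Fraction of relevant suggestions among the top-k results."""
    return sum(1 for rel in relevance[:k] if rel) / k


def reciprocal_rank(relevance):
    """1/k for the rank k of the first relevant suggestion, or 0.0 if none."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0
```

Averaging these per-query values over all (user, context) pairs would give run-level figures comparable in form to those in Table 1.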


                    Measures
Run        P@5       MRR       TBG
choqrun    0.2194    0.3331    0.7252

Table 1: Retrieval effectiveness of our approach w.r.t. the three evaluation
measures used.

5  Conclusion  and  Future  Work 

We presented a novel method relying on the Choquet mathematical operator
to tackle the problem of contextual suggestion using the Google Places API. In
the future, we plan to extend the approach to take into account the types of
places users preferred in their profiles, and to apply the aggregation at this
level in order to learn the preferences for each type.